Abstract

The goal of this paper is to analyse a method of validating a subset of gestures
to be used as elements of an HCI interface. We investigate the applicability of
LDA to gesture data dimensionality reduction. A mutual separability analysis
of a diverse dataset of 22 natural gestures, captured with two motion-capture
devices, is provided. The Fisher criterion is used to produce measures of class
separability and class overlap.

Keywords: LDA, gestures, separability, HCI, motion-capture 



1. Introduction 

With the widespread use of motion-tracking devices, both traditional, e.g. the
computer mouse, and new, e.g. the Nintendo Wii Remote™ or cell phone
accelerometer arrays, the importance of motion-based interfaces in
Human-Computer Interaction (HCI) systems has become unquestionable. The
commercial success of simple motion-capture devices led to the development of
more robust and versatile acquisition systems, both mechanical, e.g. Cyberglove
Systems CyberGlove™, Measurand ShapeWrap™, DGTech DG5VHand™, and optical,
e.g. Microsoft Kinect™, Asus WAVI Xtion™. Recent years have also brought an
increased interest in the analysis of human motion itself [TOl HH d] .

While modern motion-capture systems provide accurate recordings of human body
movement, the creation of an HCI interface based on the acquired data is not
a trivial task. The presence of noise in the data, as well as its large
dimensionality, makes it difficult to analyse. Additionally, the hand movement
during the execution of a particular gesture may vary significantly between
subjects. Some gestures may become unrecognisable with respect to a particular
capturing device.

A human-computer interface based on the broad range of natural human gestures
represents the most demanding requirement. However, because the recognition of
certain gestures by the computer might be a difficult task, a limited subset
of human gestures can be selected by the interface designer.

Simple motion-based interfaces limit their elements to a subset of artificial,
well distinguishable gestures or just the detection of the presence of body
motion. Therefore, an additional challenge lies in creating an interface based
on gestures that are also perceived as natural by users. The choice of a gesture
subset for an HCI interface, to be considered natural, should be based on the
subjective convenience of use. However, a developer needs an objective measure
of the suitability of a gesture for the interface. For an HCI interface element,
such a measure should be related to the difficulty of gesture classification.
It should also be independent of the choice of capturing device and
classification method.

Since the quality of classification is closely related to the distinctiveness
of a classified pattern, this paper considers the problem of finding a gesture
separability measure and detecting overlap between gesture classes in the
acquired data. In our work we concentrated on hand gestures captured with two
mechanical motion-capture systems. We used a diverse gesture database of
twenty-two natural gestures performed by a number of participants with varying
execution speeds [6].

Looking for a reliable separability measure, we decided to use Linear
Discriminant Analysis (LDA). While this method has some limitations,
particularly regarding the similarity of class covariance matrices, it has
proved to produce good results for many applications including face
recognition [13] and speech detection [8].

To reduce the initial dimensionality of the data, the Principal Component
Analysis (PCA) technique is often employed before performing LDA. However, as
suggested by [13], a potential problem lies in an incompatibility of the PCA and
LDA criteria, when PCA discards dimensions that contain important
discriminative information. Since gesture classification is often based on small
but significant differences in gesture patterns, we decided to limit the initial
data processing to simple, essential operations.

The paper is organized as follows. Section 2 (Related work) presents the 
selection of works on similar subjects. Section 3 (Method) describes the ex- 
periment. Results and charts are presented in Section 4 (Results). Section 5 
(Discussion) provides author remarks on the subject, while Section 6 (Conclu- 
sion) concludes the research. 

2. Related work 

In [9], the authors provide an analysis of the LDA and PCA algorithms with
a discussion of their performance for the purpose of object recognition. The
authors present results of experiments using a face image database.

In [T^, authors use LDA-based feature extraction techniques for face recog- 
nition. Authors discuss the problem of a classifier becoming overfitted to the 
training set which leads to discarding useful discriminative information. An ap- 
proach using random subspace and bagging is proposed to create a robust face 
recognition system. 

In the paper [12] a motion-capture system based on a data glove, used for
dynamic signature verification, is described. The technique used by the authors
is based on Singular Value Decomposition (SVD) and produces an accurate rate
of genuine-forgery detection.
Gesture recognition for accelerometer-based motion capture systems is
presented in [2]. The authors present an algorithm employed for cell phones.
They developed a two-stage system consisting of Bayesian networks and Support
Vector Machines (SVM) to resolve confusing gesture pairs. A similar problem was
described in [3], where the data was preprocessed using PCA to reduce its
dimensionality, and Hidden Markov Models (HMM) and Dynamic Time Warping
(DTW) were employed for data classification.

A thorough analysis of the gesture dataset used in the experiments, along with
a discussion of the benefits of naturalness of HCI interface elements, can be
found in [5]. A PCA analysis of the same dataset, together with a visualization
of eigengestures, can be found in [5].

3. Method 

The goal of the experiment is to determine the mutual separability for a set
of gestures, using the Fisher criterion as a separability measure. In the first
step, gesture data is projected onto a lower-dimensional classification space.
Then the mutual separability is determined for every gesture pair.

3.1. Experiment data 

A set of twenty-two natural hand gesture classes from the 'IITiS Gesture
Database' (Tab. 1) was used in the experiments. Gestures were recorded with two
types of hardware. The first one was the DGTech DG5VHand™ motion capture
glove¹ [4], containing 5 finger bend sensors (resistance type) and a three-axis
accelerometer producing three acceleration and two orientation readings. The
sampling frequency was approximately 33 Hz. The second one was the Cyberglove
Systems CyberGlove™ ² with a CyberForce™ System for position and orientation
measurement. The device produces 15 finger bend, three position and four
orientation readings with a frequency of approximately 90 Hz.

During the experiment, each participant was sitting at the table with the 
motion capture glove on his right hand. Before the start of the experiment, 
the hand of the participant was placed on the table in a fixed initial position. 
At the command given by the operator sitting in front of the participant, the 
participant performed the gestures. Each gesture was performed six times at 
natural pace, two times at a rapid pace and two times at a slow pace. Gestures 
number 2, 3, 7, 8, 10, 12, 13, 14, 15, 17, 18, 19, 21 are periodical and in their case 
the single performance consisted of three periods. The end of data acquisition 
was decided by the operator. 

3.2. Data preprocessing 

A motion capture recording performed with a device with $m$ sensors generates
a time sequence of vectors $x_j \in \mathbb{R}^m$. For the purpose of our work each



¹ http://www.dg-tech.it/vhand

² http://www.cyberglovesystems.com/products/cyberglove-ii/overview
Table 1: The gesture list used in experiments

No.  Name           Classᵃ         Motionᵇ  Comments
1    A-OK           symbolic       1        common 'okay' gesture
2    Walking        iconic         il       fingers depict a walking person
3    Cutting        iconic         F        fingers portray cutting a sheet of paper
4    Shove away     iconic         T        hand shoves away imaginary object
5    Point at self  deictic        RF       finger points at the user
6    Thumbs up      symbolic       RF       classic 'thumbs up' gesture
7    Crazy          symbolic       TRF      symbolizes 'a crazy person'
8    Knocking       iconic         RF       finger in knocking motion
9    Cutthroat      symbolic       TR       common taunting gesture
10   Money          symbolic       F        popular 'money' sign
11   Thumbs down    symbolic       RF       classic 'thumbs down' gesture
12   Doubting       symbolic       F        popular Polish(?) flippant 'I doubt'
13   Continue       iconicᶜ        R        circular hand motion 'continue', 'go on'
14   Speaking       iconic         F        hand portrays a speaking mouth
15   Hello          symbolicᶜ      R        greeting gesture, waving hand motion
16   Grasping       manipulative   TF       grasping an object
17   Scaling        manipulative   F        finger movement depicts size change
18   Rotating       manipulative   R        hand rotation depicts object rotation
19   Come here      symbolicᶜ      F        fingers waving; 'come here'
20   Telephone      symbolic       TRF      popular Polish(?) 'phone' depiction
21   Go away        symbolicᶜ      F        fingers waving; 'go away'
22   Relocate       deictic        TF       'put that there'

ᵃ We use the terms 'symbolic', 'deictic', and 'iconic' based on McNeill & Levy [10] classification,
  supplemented with a category of 'manipulative' gestures (following
ᵇ Significant motion components: T-hand translation, R-hand rotation, F-individual finger movement
ᶜ This gesture is usually accompanied with a specific object (deictic) reference

recording was linearly interpolated and re-sampled to $t = 100$ samples,
generating data matrices $A_l = [x_{ij}^{(l)}] \in \mathbb{R}^{m \times t}$, where $l$
enumerates recordings. Then data matrices were normalized by computing the
t-statistics

$$A'_l = \left[ \frac{x_{ij}^{(l)} - \bar{x}_i}{\sigma_i} \right],$$

where $\bar{x}_i$, $\sigma_i$ are the mean and standard deviation for a given sensor $i$
taken over all $l$ recordings.

Subsequently every matrix $A'_l$ was vectorized row-by-row, so that it was
transformed into a data vector

$$x_l = [x'_{11}, \ldots, x'_{1t}, \ldots, x'_{m1}, \ldots, x'_{mt}]^T$$

belonging to $\mathbb{R}^p$, $p = mt$. Then those data vectors were organized into
$n = 22$ classes $C_k$. Then vectors belonging to each of the classes were
horizontally stacked, forming the set $Q = \{G_{C_k} \in \mathbb{R}^{p \times n_k}\}$ of
data matrices, where $n_k$ is the number of vectors in class $C_k$.
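
The preprocessing steps described above can be sketched as follows. This is
only a minimal illustration under stated assumptions, not the code used in the
experiments: the function names `resample` and `preprocess` are hypothetical,
each recording is assumed to be an array of shape (m, T) with one row per
sensor, and the guard against a zero standard deviation is our addition.

```python
import numpy as np

def resample(recording, t=100):
    """Linearly interpolate an (m, T) recording to exactly t samples per sensor."""
    recording = np.asarray(recording, dtype=float)
    old = np.linspace(0.0, 1.0, recording.shape[1])
    new = np.linspace(0.0, 1.0, t)
    return np.vstack([np.interp(new, old, row) for row in recording])

def preprocess(recordings, t=100):
    """Return a (p, n) matrix, p = m * t, one column per recording."""
    A = np.stack([resample(r, t) for r in recordings])   # shape (n, m, t)
    # Mean and standard deviation per sensor, taken over all recordings.
    mean = A.mean(axis=(0, 2), keepdims=True)
    std = A.std(axis=(0, 2), keepdims=True)
    std[std == 0] = 1.0   # assumption: leave constant sensors unscaled
    Z = (A - mean) / std
    # Row-by-row vectorisation of every normalised matrix.
    return Z.reshape(len(recordings), -1).T
```

Grouping the columns of the returned matrix by gesture label then gives the
per-class data matrices.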

3.3. LDA 

Linear Discriminant Analysis, thoroughly presented in [7], is a supervised,
discriminative technique producing an optimal linear classification function,
which transforms the data from the $p$-dimensional space $\mathbb{R}^p$ into
a lower-dimensional classification space.

3.3.1. Two classes 

Originally the problem was formulated by Fisher for two classes in the
following form.

Let us consider two sets of vectors $x_l = [x_1^{(l)}, \ldots, x_p^{(l)}]^T$,
$l = 1, \ldots, n$, belonging to two classes $C_k$, $k \in \{1, 2\}$, whose covariance
matrices are equal. The goal is to find the vector $a \in \mathbb{R}^p$ that optimally
separates these classes. It can be shown that this vector maximizes the equation

$$J_2(a) = \frac{(a^T \bar{x}_2 - a^T \bar{x}_1)^2}{a^T W a}, \qquad (1)$$

where $\bar{x}_1$ and $\bar{x}_2$ denote means for classes $C_1$, $C_2$ respectively.
$W$ denotes the within-class covariance matrix calculated in the following way

$$W = \frac{1}{n - 2} \sum_{k=1}^{2} (n_k - 1) S_k,$$

where $n$ is the number of all data vectors, $n_k$ is the number of data vectors
in class $C_k$ and $S_k$ is the covariance matrix for class $C_k$ calculated from
the equation

$$S_k = \frac{1}{n_k - 1} \sum_{x_i \in C_k} (x_i - \bar{x}_k)(x_i - \bar{x}_k)^T.$$
It can be shown that $a \propto W^{-1}(\bar{x}_2 - \bar{x}_1)$. We will call the vector
$a$ the first canonical vector. This vector is the basis of the following
classification criterion. Given a vector $x$, we classify it to class $C_1$ if
the following relation is fulfilled

$$|a^T x - a^T \bar{x}_1| < |a^T x - a^T \bar{x}_2|, \qquad (2)$$

otherwise we classify it to class $C_2$.
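
The two-class criterion can be illustrated with a short sketch. It assumes that
samples are stored as columns, that the pooled matrix $W$ is invertible (which
requires more samples than dimensions), and it uses hypothetical function
names; it is an illustration of the formulas above rather than the
implementation used in the experiments.

```python
import numpy as np

def fisher_two_class(X1, X2):
    """X1, X2: (p, n1) and (p, n2) matrices with samples as columns.

    Returns the first canonical vector a, the criterion value J_2(a)
    and both class means.
    """
    n1, n2 = X1.shape[1], X2.shape[1]
    m1, m2 = X1.mean(axis=1), X2.mean(axis=1)
    D1, D2 = X1 - m1[:, None], X2 - m2[:, None]
    # Pooled within-class covariance: sum of (n_k - 1) S_k over n - 2.
    W = (D1 @ D1.T + D2 @ D2.T) / (n1 + n2 - 2)
    a = np.linalg.solve(W, m2 - m1)          # a = W^{-1} (mean_2 - mean_1)
    J = (a @ m2 - a @ m1) ** 2 / (a @ W @ a)
    return a, J, m1, m2

def classify(x, a, m1, m2):
    """Return 1 or 2 according to the classification rule (2)."""
    return 1 if abs(a @ x - a @ m1) < abs(a @ x - a @ m2) else 2
```

Solving the linear system instead of inverting $W$ explicitly is a design
choice made here for numerical stability.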



3.3.2. Many classes 

To find the best separation for a $k$-class problem, when $k > 2$, the vector $a$
should maximize the following equation

$$J_m(a) = \frac{a^T B a}{a^T W a}. \qquad (3)$$

The matrix $B$ is called the between-class scatter matrix and is calculated in
the following way

$$B = \frac{1}{k - 1} \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T,$$

where $\bar{x}$ denotes the aggregated class mean

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

The matrix $W$ is called the within-class scatter matrix

$$W = \frac{1}{n - k} \sum_{i=1}^{k} \sum_{x_j \in C_i} (x_j - \bar{x}_i)(x_j - \bar{x}_i)^T,$$

where $n$ is the number of all the samples in all the classes.

The eigenvectors of the matrix $W^{-1}B$, ordered by their respective eigenvalues,
are called the canonical vectors. It can be proved that the first canonical
vector $a$ of $W^{-1}B$ maximizes the expression (3). By selecting the first $d$
canonical vectors and forming from them the projection matrix
$A^{(d)} \in \mathbb{R}^{d \times p}$, any $x \in \mathbb{R}^p$ can be projected onto
a lower-dimensional feature space $\mathbb{R}^d$. This projection separates vectors
in classes.
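
A compact sketch of the multi-class computation is given below. It assumes
that each class is stored as a matrix with samples as columns and that $W$ is
invertible; `canonical_vectors` and `project` are hypothetical names, and the
real parts of the eigen-decomposition are taken because small imaginary
components may appear numerically for the non-symmetric matrix $W^{-1}B$.

```python
import numpy as np

def canonical_vectors(classes):
    """classes: list of (p, n_i) matrices, one per class, samples as columns.

    Returns the eigenvalues of W^{-1} B in descending order and a matrix
    whose rows are the corresponding canonical vectors.
    """
    k = len(classes)
    n = sum(G.shape[1] for G in classes)
    grand_mean = np.hstack(classes).mean(axis=1)
    p = grand_mean.shape[0]
    B = np.zeros((p, p))
    W = np.zeros((p, p))
    for G in classes:
        mean = G.mean(axis=1)
        d = (mean - grand_mean)[:, None]
        B += G.shape[1] * (d @ d.T)      # between-class scatter
        D = G - mean[:, None]
        W += D @ D.T                     # within-class scatter
    B /= k - 1
    W /= n - k
    vals, vecs = np.linalg.eig(np.linalg.solve(W, B))
    order = np.argsort(vals.real)[::-1]
    return vals.real[order], vecs.real[:, order].T

def project(A, X, d):
    """Project the columns of X onto the first d canonical vectors."""
    return A[:d] @ X
```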



3.4. Gesture separability and overlap 

Our goal is to determine gesture class separability and gesture class overlap
as a function of the dimensionality $d$ of the feature space. In order to do so
we calculate two sets of coefficients: $\lambda^{(d)}(C_{k_1}, C_{k_2})$ and
$\gamma^{(d)}(C_{k_1}, C_{k_2})$, defined below.
3.4.1. Step 1

To reduce the dataset dimensionality, LDA is performed on the dataset $Q$
and the matrix $A^{(d)}$ is calculated.

The dataset $Q$ is projected onto the $d$-dimensional space spanned by the first
$d$ canonical vectors using the matrix $A^{(d)}$. We obtain the transformed dataset
$Q^{(d)} = \{G^{(d)}_{C_k} \in \mathbb{R}^{d \times n_k}\}$, where
$G^{(d)}_{C_k} = A^{(d)} G_{C_k}$.

Projection of the data from $\mathbb{R}^p$ onto a lower-dimensional feature space
$\mathbb{R}^d$ decreases the quality of data separation. But at the same time
lower-dimensional feature spaces are more desirable. Therefore, to determine the
appropriate value of $d$ we apply the following procedure. The family of reduced
datasets $Q^{(d)}$, $d \in \{1, \ldots, p-1\}$, is subjected to the LDA algorithm and,
using the formula from equation (3), the reduced dataset separation measure
$\lambda_d = J_m(a)$ is calculated. We look for such a value of $d$ that
$\lambda_{d+1} - \lambda_d$ is small. After determining an appropriately small
$d_0$ we use it in the next step.
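
The selection rule for $d_0$ can be written down as a short sketch. The text
only requires the gain $\lambda_{d+1} - \lambda_d$ to be 'small', so the
explicit tolerance `tol` and the function name `choose_dimension` are
assumptions made for illustration.

```python
def choose_dimension(lams, tol):
    """Return the smallest d for which lambda_{d+1} - lambda_d < tol.

    lams[i] holds the separation measure obtained for d = i + 1.
    If the gain never drops below tol, the largest tested d is returned.
    """
    for i in range(len(lams) - 1):
        if lams[i + 1] - lams[i] < tol:
            return i + 1
    return len(lams)

# Hypothetical separation values for d = 1, 2, 3 (illustrative only).
d0 = choose_dimension([0.90, 0.95, 0.952], tol=0.01)
```

With these illustrative values the gain from $d = 1$ to $d = 2$ is 0.05, which
exceeds the tolerance, while the gain from $d = 2$ to $d = 3$ does not, so the
sketch selects $d_0 = 2$.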

3.4.2. Step 2

In this step a measure of class separability $\lambda^{(d_0)}(C_{k_1}, C_{k_2})$ is
obtained for every pair of distinct classes $C_{k_1}$, $C_{k_2}$.

Now the reduced data from the set $Q^{(d_0)}$ are once more subjected to the LDA
algorithm. For every pair of distinct classes $C_{k_1}$, $C_{k_2}$, and therefore for
every pair of corresponding data matrices $G^{(d_0)}_{C_{k_1}}$ and
$G^{(d_0)}_{C_{k_2}}$, we calculate the class separation measure
$\lambda^{(d_0)}(C_{k_1}, C_{k_2}) = J_2(a^{(d_0)}_{k_1 k_2})$, equal to the maximized
value of linear separation obtained from equation (1), where
$a^{(d_0)}_{k_1 k_2}$ is the first canonical vector separating those two classes.
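
The pairwise computation of Step 2 may be sketched as below. The sketch assumes
that the reduced class matrices store samples as columns and that the pooled
covariance of every pair is invertible; `j2` and `separability_matrix` are
hypothetical names, and no normalization of the resulting values is applied.

```python
import numpy as np
from itertools import combinations

def j2(X1, X2):
    """Maximised two-class Fisher criterion for (d, n1) and (d, n2) matrices."""
    m1, m2 = X1.mean(axis=1), X2.mean(axis=1)
    D1, D2 = X1 - m1[:, None], X2 - m2[:, None]
    W = (D1 @ D1.T + D2 @ D2.T) / (X1.shape[1] + X2.shape[1] - 2)
    a = np.linalg.solve(W, m2 - m1)      # first canonical vector of the pair
    return (a @ (m2 - m1)) ** 2 / (a @ W @ a), a

def separability_matrix(classes):
    """Symmetric matrix of pairwise separability values (zero diagonal)."""
    k = len(classes)
    lam = np.zeros((k, k))
    for i, j in combinations(range(k), 2):
        lam[i, j] = lam[j, i] = j2(classes[i], classes[j])[0]
    return lam
```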

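3.4.3. Step 3

In this step a measure of class overlap $\gamma^{(d_0)}(C_{k_1}, C_{k_2})$ is obtained
for every pair of distinct classes $C_{k_1}$, $C_{k_2}$. The procedure goes as
follows: for every pair $C_{k_1}$, $C_{k_2}$, using the first canonical vector
$a^{(d_0)}_{k_1 k_2}$ from the previous step, first calculate
$v_{k_1} = (a^{(d_0)}_{k_1 k_2})^T G^{(d_0)}_{C_{k_1}}$ and
$v_{k_2} = (a^{(d_0)}_{k_1 k_2})^T G^{(d_0)}_{C_{k_2}}$, then calculate

$$\gamma^{(d_0)}(C_{k_1}, C_{k_2}) =
\begin{cases}
\sup(v_{k_1}) - \inf(v_{k_2}) & \text{if } \bar{x}_{k_1} < \bar{x}_{k_2}, \\
\sup(v_{k_2}) - \inf(v_{k_1}) & \text{otherwise,}
\end{cases}$$

where $\bar{x}_{k_1}$, $\bar{x}_{k_2}$ are the means of $v_{k_1}$, $v_{k_2}$. A value of
$\gamma^{(d_0)}(C_{k_1}, C_{k_2}) > 0$ indicates that classes $C_{k_1}$, $C_{k_2}$ are not
completely separable when projected onto the $d$-dimensional feature space.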
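
The overlap measure reduces to a comparison of the extreme values of the two
projected classes. A minimal sketch, assuming `v1` and `v2` are plain
sequences of the one-dimensional projections of the two classes (the function
name is hypothetical):

```python
def overlap(v1, v2):
    """Overlap of two sets of 1-D projections; a positive value means overlap."""
    mean1 = sum(v1) / len(v1)
    mean2 = sum(v2) / len(v2)
    if mean1 < mean2:
        # Class 1 lies to the left: compare its maximum with the minimum of class 2.
        return max(v1) - min(v2)
    return max(v2) - min(v1)
```

For example, projections `[0, 1, 2]` and `[3, 4, 5]` give a negative value (the
classes are separated by a gap), whereas `[0, 1, 3.5]` and `[3, 4, 5]` give
a positive value, because one sample of the first class falls inside the range
of the second.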

4. Results 

The results are presented for two devices: DG5VHand (DG5) and CyberGlove
(CG).

To facilitate gesture dataset processing, in Step 1 of our algorithm, its
dimensionality was reduced by projecting the data from $\mathbb{R}^p$, where $p = 1000$
for the DG5 and $p = 2200$ for the CG device, onto a lower-dimensional feature space
No.  Name           DG5 CG Summary
1    A-OK           1 m
2    Walking        + + +
3    Cutting        1 m
4    Shove away     1 ^
5    Point at self  1 m
6    Thumbs up      + + +
7    Crazy          + + +
8    Knocking       + + +
9    Cutthroat      1 m
10   Money
11   Thumbs down    + +
12   Doubting       + + +
13   Continue       + •
14   Speaking
15   Hello          + + +
16   Grasping       + + +
17   Scaling        + - •
18   Rotating       + •
19   Come here
20   Telephone      + - •
21   Go away        + •
22   Relocate       + - •

Table 2: Concluded separability for gestures: + denotes a separable gesture, - a problematic
gesture and •, a gesture that is problematic for only one of the tested devices.

$\mathbb{R}^d$. For both devices we calculated the normalized class separation value
for a one-dimensional projection and obtained very similar results,
$\lambda_{d=1} \approx 0.9524$. By observing the value of $\lambda_{d+1} - \lambda_d$, we
determined that $\lambda_{d=2} - \lambda_{d=1} \approx 0.045$ while
$\lambda_{d=3} - \lambda_{d=2} < 0.003$. A further increase of $d$ leads to a minimal gain
in the class separability value. Based on this observation we chose the dimension
$d_0 = 2$ for the initial projection of data in Step 1.
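
The values of $p$ quoted above are consistent with $p = mt$ from Section 3.2.
As a short worked check, assuming that every reading listed in Section 3.1 is
used as one sensor channel and that $t = 100$:

```latex
\begin{align*}
p_{\mathrm{DG5}} &= (5 + 3 + 2) \cdot 100 = 1000, \\
p_{\mathrm{CG}}  &= (15 + 3 + 4) \cdot 100 = 2200.
\end{align*}
```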

The projection of the experiment data on $\mathbb{R}^2$ is presented in Fig. 1. Most
of the gestures are well separated. In the majority of visible gesture classes,
elements are centred around their respective means, with an almost uniform
variance. Potential conflicts for a small number of gestures may be observed in
local regions of the projected data space. The summary of gesture suitability
as an element of an HCI interface, using the separability criterion $\lambda$, is
presented in Tab. 2. A gesture $C_k$ was classified as separable when its
separability $\lambda_{\min} > T_d$, where $T_d$ is an arbitrary value of the device
separability threshold and $\lambda_{\min}$ is the minimal value of $\lambda$ for $C_k$.
In our experiment we took thresholds $T_{DG5} = 0.0004$ and
$T_{CG} = 0.0049$.
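
The thresholding rule can be illustrated with a short sketch. It assumes that
the pairwise separability values are available as a square matrix (a list of
lists with an ignored diagonal); the function name and the example matrix are
hypothetical and serve only as an illustration.

```python
def separable_gestures(lam, threshold):
    """Mark gesture i as separable when its minimal pairwise value exceeds the threshold."""
    k = len(lam)
    return [
        min(lam[i][j] for j in range(k) if j != i) > threshold
        for i in range(k)
    ]

# Hypothetical separability matrix for three gestures (illustrative values only).
example = [
    [0.0,  0.01,   0.02],
    [0.01, 0.0,    0.0001],
    [0.02, 0.0001, 0.0],
]
flags = separable_gestures(example, threshold=0.0049)
```

In this example only the first gesture is marked as separable, because the
second and third gestures are poorly separated from each other.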

Tab. 3 presents gesture class pairs where the value of class overlap $\gamma > 0$.
Class overlap was detected for 1.58% of analysed gestures; therefore, 81.8% of
Ord.  Device       $g_1$  $g_2$  $\gamma_{g_1 g_2}$
1     DG5VHand™    13     18     2.96 × 10^-5
2     CyberGlove™  1      14     2.94 × 10^-3
3     CyberGlove™  5      20     1.58 × 10^-3
4     CyberGlove™  22     17     4.54 × 10^-3

Table 3: Overlapping gesture pairs ($\{g_1, g_2\} : \gamma_{g_1 g_2} > 0$)



tested classes are completely separable. The small number of conflicts in class
data indicates a potentially good performance of a gesture classifier based on
the analysed dataset.

It is not surprising that for devices using optical tapes and an accelerometer
for data acquisition, high separability of a gesture seems to be associated with
an active, fast hand movement, e.g. Hello (15), Doubting (12), and repeated use
of individual fingers, e.g. Walking (2), Knocking (8). Indistinguishable gestures
usually employ a wrist movement and an unrestricted position of fingers, e.g.
Shove away (4), Continue (13), Go away (21).

In Fig. 2 relatively higher separability values for CyberGlove™ can be
observed. However, more instances of class overlap were detected for this
device. This problem may be related to the arm mount used to acquire hand
movement and orientation readings. While its readings are more precise than
those of the DG5VHand™ accelerometer array, the mount slightly restricts arm
movement, which results in more cautious gesture performance and may hinder the
execution of particular gestures.

5. Conclusion 

One of the key requirements of an effective HCI is to allow the user to con- 
centrate on the task that is being carried out, not on the interface elements 
or interaction mechanics. Actual gesture recognition rate is crucial for this, as 
recognition errors focus user's attention on the interaction, and away from the 
objective. We argue that the separability of a gesture is an important, yet
undervalued, measure of its distinctiveness from other patterns, and thus of
its potential performance.

LDA provides a well documented measure of separability that can be used
for choosing a well separable gesture data set for an HCI interface. Despite its
limitations, the Fisher criterion provides satisfactory results for the analysis
of motion-capture data.
Figure 1: LDA of a dataset $Q$. The data is projected on $d = 2$ first eigenvectors of $W^{-1}B$.
Devices: DG5VHand™ (a), CyberGlove™ (b).

Figure 2: Graphical representation of the separability matrix for (a) DG5VHand and (b)
CyberGlove. Each plot represents a row of the matrix. Plots are scaled according to the
maximal value indicated in the upper-left corner of each of the plots. A higher value
indicates better separability.



